Abstract

We evaluate with simulated data a new type of sample variance for the characterization of 
frequency stability. The new statistic (referred to as TOTALVAR and its square root TOTALDEV) 
is a better predictor of long-term frequency variations than the present sample Allan deviation. The 
statistical model uses the assumption that a time series of phase or frequency differences is wrapped 
(periodic) with overall frequency difference removed. We find that the variability at long averaging 
times is reduced considerably for the five models of power-law noise commonly encountered with 
frequency standards and oscillators. 


INTRODUCTION 

The most common method of quantifying frequency stability between oscillators is to evaluate
the RMS of the fractional frequency changes vs. averaging time $\tau$, dubbed the Allan deviation [1].
For any sequence of average fractional frequency deviations $\{y_k\}$, the widely used quantity
$\sigma_y(\tau)$ is ideally suited as a reliable, easily interpretable statistic for the
characterization of frequency stability for common kinds of FM oscillator noise [2, 3].

There is a considerable literature on various methods and candidate statistics for the
characterization of relative oscillator frequency stability. Suffice it to say that for a given
system and noise, a statistic can be constructed to be nearly optimum. A single, unified approach
will have its compromises. The Allan deviation, however, has a remarkable range of applicability
in quantifying frequency and phase stability. This is because, as a function of averaging time
$\tau$, it is particularly well suited to identifying the model of the trend in frequency stability,
or what is called the underlying "power-law," over a range of $\tau$ values. The power-law is the
slope on a typical log-log $\sigma_y(\tau)$ plot, and $\sigma_y(\tau)$ is suitably the RMS prediction
error of frequency stability. Predicting the long-term stability of a frequency reference rests
ultimately on predicting (correctly identifying) its power-law behavior. For an estimate of
stability longer or different than the measurement at hand, simply extrapolate from or directly
apply an expected trend (power-law slope). Lastly, we can estimate the evolution of a squared phase
error $\sigma_x^2$ proportional to $\tau^2$ times the modified Allan variance for a uniquely
identified power-law slope. This is the time variance or TVAR [4].
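
The extrapolation step just described can be sketched in a few lines. This is only an
illustration and not part of the original study: the function names `loglog_slope` and
`extrapolate_sigma` and the example numbers are hypothetical, and a single power law is
assumed to hold over the whole range of $\tau$.

```python
import numpy as np

def loglog_slope(taus, sigmas):
    """Least-squares slope of log10(sigma) versus log10(tau)."""
    slope, _intercept = np.polyfit(np.log10(taus), np.log10(sigmas), 1)
    return slope

def extrapolate_sigma(tau_ref, sigma_ref, slope, tau_new):
    """Extend a deviation from tau_ref to tau_new along an assumed power-law slope."""
    return sigma_ref * (tau_new / tau_ref) ** slope

# Hypothetical example: deviations that follow tau**(-1/2), the white FM slope.
taus = 2.0 ** np.arange(8)
sigmas = 3e-12 * taus ** -0.5
slope = loglog_slope(taus, sigmas)

# Predict the deviation at four times the longest measured averaging time.
sigma_long = extrapolate_sigma(taus[-1], sigmas[-1], slope, 4.0 * taus[-1])
```

The prediction is only as good as the slope identification, which is why confidence at the
longest averaging times matters.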

A long-standing problem is that the best statistic, the two-sample Allan deviation, has rather
poor confidence at longer and longer $\tau$ values, where confidence is often needed most. A
new statistic has been developed which retains the intuitive simplicity of the RMS fractional
frequency changes (Allan deviation) and which has improved confidence at long-term averaging
times [5]. The model for the new statistic uses the assumption that a time series of phase
or frequency differences is wrapped (periodic) with overall frequency difference removed [6].
Figure 1 illustrates the procedure. This variance (thus its square root) reduces estimation
errors universally seen in previous treatments, thereby providing a better estimate of frequency
stability for measurement times longer than, say, 20% of the data length.

We compare the response of the new statistic (as a variance) to the traditional Allan variance
by simulation of the five models of power-law noise commonly encountered with oscillators
and frequency standards. Results show that the new variance promises greatly reduced
variability, and hence uncertainty, compared to the traditional Allan variance.


DISCUSSION 

The sample Allan deviation $\hat{\sigma}_y(\tau)$ and mod $\hat{\sigma}_y(\tau)$ are square roots
of two types of tau-domain sample variances (AVAR and MVAR) [1, 7]. They are recommended statistics
in quantifying frequency stability between oscillators. In certain situations their responses have
high variability at long averaging times $\tau$, as indicated by traditional simulation studies
using common noise types, because the traditional sample Allan statistics are time-shift dependent.
Therefore these statistics have degraded confidence at long averaging times. The method of complex
demodulation motivates another statistic which is an improved sample variance for the
characterization of frequency stability [5]. For average fractional frequency fluctuations
$\{y_{k'}\} = y_1, \ldots, y_{N-1}$ with overall frequency difference removed, this sample variance
is given by:


$$\hat{\sigma}^2_{total}(\tau) = \frac{1}{N-1} \sum_{j=1}^{N-1} \left[ \frac{1}{2(M-1)} \sum_{k=1}^{M-1} \left( y_{k+1,j} - y_{k,j} \right)^2 \right] \qquad (1)$$


where $\{y_{k',j}\} = y_{j+1}, y_{j+2}, \ldots, y_{N-1}, y_1, y_2, \ldots, y_j$ are spaced by
$\tau_0$ and $\{y_{k'}\}$ is therefore wrapped and re-indexed by $j$. Series $\{y_{k,j}\}$, with
unprimed $k$, are averages implied over $\tau = m\tau_0$. Hence, as with traditional AVAR (and
MVAR), the new sample variance $\hat{\sigma}^2_{total}$ is implicitly dependent on the
dimensionless quantity $m$, a scale parameter which determines $\tau$ and which for efficiency can
be limited to rational powers of 2, that is, $m = 2^i$, $i = 0, 1, 2, 3, \ldots$.

Measurements of relative phase differences $\{x_{k'}\}$ are preferred to average frequency
$\{y_{k',j}\}$ as in Equation (1). We have $k' = 1, 2, 3, \ldots, N$, with points separated by
interval $\tau_0$ and overall frequency difference removed; therefore $x_1 = x_N$. Furthermore
$\{x_{k'}\}$ is wrapped and assumed periodic; hence $x_1 = x_{N+1}$ and we eliminate the increment
$x_N$ to $x_{N+1}$ to avoid bias (see Figure 1). We have
where the argument in the brackets "[ ]" has stride $k' = m$ and is time-shifted by $j\tau_0$ and
averaged for all $N-1$ possible shifts. This notation centers the second-difference operation
(argument in parentheses) at $k'$ with a span of $\pm m$, which seems more intuitive, especially
considering the wrap procedure.
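
The wrap procedure can be made concrete with a short numerical sketch. The code below is an
interpretation of the description above and is not the authors' formula: it assumes that
averaging strided second differences over every shift of a periodic series is equivalent to
taking the second difference at every sample of the wrapped series. The function name
`wrapped_totalvar` and its arguments are hypothetical.

```python
import numpy as np

def wrapped_totalvar(x, m, tau0=1.0):
    """Sketch of a wrapped (periodic) two-sample variance from phase data x.

    Assumption: the mean over all circular shifts of the squared second
    difference x[k-m] - 2*x[k] + x[k+m], scaled by 1/(2*tau**2).
    """
    x = np.asarray(x, dtype=float)
    if m < 1:
        raise ValueError("m must be a positive integer")
    # Remove the overall frequency difference: subtract the straight line
    # through the end points so that the first and last samples coincide.
    ramp = np.linspace(x[0], x[-1], x.size)
    # Drop the duplicated end point; the wrapped series has period N - 1.
    z = (x - ramp)[:-1]
    # Circular second difference centered at every sample, span +/- m.
    d2 = np.roll(z, -m) - 2.0 * z + np.roll(z, m)
    tau = m * tau0
    return np.mean(d2 ** 2) / (2.0 * tau ** 2)

# Hypothetical usage on white PM phase noise, N = 1024 points.
rng = np.random.default_rng(0)
phase = rng.standard_normal(1024)
totaldev = np.sqrt(wrapped_totalvar(phase, m=256))
```

Because every sample serves as the center of a second difference, the result does not depend
on where the time origin is placed within the wrapped series.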


STATISTICS COMPARED 

The primary reasons for using $\sigma_y(\tau)$ are that it is well-known, it is simple to
calculate, it is the most efficient estimator for FM noise, and it has a unique value for all
$\tau$. The primary disadvantage of using $\sigma_y(\tau)$ is that the results can be too
conservative, or sometimes very optimistic, at the long $\tau$ values. It can take much longer than
the longest reportable $\tau$ values (often orders of magnitude longer) to accurately quantify the
underlying low-frequency variations between the frequency standards being evaluated [3]. For
example, quantifying the frequency stability at, say, $\tau$ equals two weeks often requires no
less than two months of actual measurement time.

We compare the new sample variance $\hat{\sigma}^2_{total}(\tau)$ (also called TOTALVAR) to
traditional AVAR $\hat{\sigma}^2_y(\tau)$ using simulation studies of five common integer power-law
noise types. These noise types are white PM, flicker PM, white FM, flicker FM, and random walk FM.
A version of $\hat{\sigma}^2_{total}(\tau)$ called mod $\hat{\sigma}^2_{total}(\tau)$ exists for
MVAR; however, since our present emphasis is on confidence at long $\tau$ values, AVAR is of
interest. MVAR's advantage is in distinguishing white PM from flicker PM, which usually are
associated with short $\tau$ values. MVAR has no advantage for flicker PM and beyond, which occur
at long $\tau$ values. Furthermore, a chief disadvantage of MVAR is that it only extends to 1/3 the
total data length, whereas AVAR extends to 1/2 the same length.

For highly divergent noise types, the new statistic is not expected to be unbiased [8, 9]. However,
this report indicates that the new statistic essentially estimates the same unbiased quantity as
traditional AVAR for the five common integer power-law noise types but has better confidence
than AVAR.
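
For reference, the traditional statistic used as the baseline in this comparison can be sketched
as follows. This is the standard maximally overlapped two-sample (Allan) variance computed from
phase data; the function name `overlapped_avar` and its arguments are hypothetical and the code is
an illustration rather than the software used for the figures.

```python
import numpy as np

def overlapped_avar(x, m, tau0=1.0):
    """Maximally overlapped Allan variance from phase samples x at tau = m*tau0."""
    x = np.asarray(x, dtype=float)
    if m < 1 or 2 * m >= x.size:
        raise ValueError("need 1 <= m and 2*m < len(x)")
    # Second difference of phase over a span of 2*m samples, at every start index.
    d2 = x[2 * m:] - 2.0 * x[m:-m] + x[:-2 * m]
    tau = m * tau0
    return np.mean(d2 ** 2) / (2.0 * tau ** 2)

# Hypothetical usage: Allan deviation of white PM phase noise at tau = 16*tau0.
rng = np.random.default_rng(0)
adev = np.sqrt(overlapped_avar(rng.standard_normal(1024), m=16))
```

The number of available second differences shrinks as `m` grows, which is the source of the poor
confidence at long averaging times discussed above.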


GENERATION OF SIMULATED $\{x_n\}$ DATA

Most high-level computer programming languages can return random variables, which we then order
as a time series $\{a_n\}$. The usual assumption is that the variables are uncorrelated and
normally (Gaussian) or uniformly distributed. Thus $\{a_n\}$ forms the basis for a
white-noise-of-phase process, which is characterized by a constant power spectral density,
$S_a(f) \propto f^0$. We build from $\{a_n\}$ the other four noise processes: flicker
($\propto f^{-1}$), random walk ($\propto f^{-2}$), flicker walk ($\propto f^{-3}$), and random run
($\propto f^{-4}$). The treatment of non-integer power-law noise types has recently been
explored [10]. We limit our simulations to the five common integer power laws.

Random walk of phase (RWPM) is equivalent to white noise of frequency (WHFM) and is
one integration (single summation) of $\{a_n\}$. Random run of phase (RRPM) is random walk
in frequency (RWFM) and is two integrations (double summation) of $\{a_n\}$. These operations
are among the simplest autoregressive (AR) procedures.
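
The summation operations above map directly onto cumulative sums. The following sketch is an
illustration only; the function name `simulate_phase`, the string labels, and the choice of
Gaussian inputs are assumptions made here.

```python
import numpy as np

def simulate_phase(kind, n, rng):
    """Phase series built from uncorrelated Gaussian inputs a_n.

    WHPM: a_n itself.
    WHFM: one summation of a_n (random walk of phase).
    RWFM: two summations of a_n (random run of phase).
    """
    a = rng.standard_normal(n)
    if kind == "WHPM":
        return a
    if kind == "WHFM":
        return np.cumsum(a)
    if kind == "RWFM":
        return np.cumsum(np.cumsum(a))
    raise ValueError(f"unsupported noise type: {kind}")

# Hypothetical usage: one 1024-point realization of each type.
rng = np.random.default_rng(0)
series = {kind: simulate_phase(kind, 1024, rng) for kind in ("WHPM", "WHFM", "RWFM")}
```

Differencing a WHFM phase series recovers the white inputs, which is a quick check that the
summation was applied once.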

Flicker processes can be generated using an AR operation but must also include an (integrated)
moving average (MA). The ARIMA model used in generating the five integer noise processes
is adequately described by


$$x_n = \phi_1 x_{n-1} + \phi_2 x_{n-2} + a_n - \theta a_{n-1}, \qquad (3)$$

where $a_n$ is an input random variable and $x_n$ is an output.

For flicker of phase (FLPM):


$\phi_1 = 1.549$,

$\phi_2 = 0.56$,

$\theta = 0.88$

Flicker walk of phase is flicker of frequency (FLFM) and is one integration (single summation)
of an FLPM series.

As mentioned, random run of phase (or random walk FM, RWFM) could be adequately realized
as only a double summation of $a_n$, which means $\phi_1 = 2$, $\phi_2 = -1$, and $\theta = 0$ in
Equation (3). Cleaner representations of RWFM are realized for $\theta = \sqrt{3} - 2$ [11]. Thus
we use:


$\phi_1 = 2$,

$\phi_2 = -1$,

$\theta = \sqrt{3} - 2 = -0.268$

For the simulations here, some thought went into initializing each sequence to obtain a
representation for the flicker and random run noise types. $a_1$ was chosen to be between 0 and
1; $x_{n-1}$, $x_{n-2}$, and $a_{n-1}$ were derived from the end of previous simulations.
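
A second-order recursion with one moving-average term, as in Equation (3), can be sketched as
below. Several points are assumptions made for this illustration: the moving-average term is
taken with the sign convention $a_n - \theta a_{n-1}$, the recursion starts from zero initial
values rather than from the end of a previous simulation, and for the FLPM example $\phi_2$ is
taken as $-0.56$, because a positive value together with $\phi_1 = 1.549$ would make the
recursion grow without bound. The function name `arma21` is hypothetical.

```python
import numpy as np

def arma21(a, phi1, phi2, theta, x_init=(0.0, 0.0), a_init=0.0):
    """x[n] = phi1*x[n-1] + phi2*x[n-2] + a[n] - theta*a[n-1]."""
    a = np.asarray(a, dtype=float)
    x = np.empty_like(a)
    x1, x2 = x_init          # x[n-1] and x[n-2] before the first sample
    a1 = a_init              # a[n-1] before the first sample
    for n, an in enumerate(a):
        xn = phi1 * x1 + phi2 * x2 + an - theta * a1
        x[n] = xn
        x2, x1, a1 = x1, xn, an
    return x

rng = np.random.default_rng(0)
a = rng.standard_normal(1024)

# RWFM with the coefficients listed above.
rwfm = arma21(a, 2.0, -1.0, np.sqrt(3.0) - 2.0)

# FLPM; the negative sign on phi2 is an assumption (see the lead-in).
flpm = arma21(a, 1.549, -0.56, 0.88)

# FLFM is one summation of an FLPM series.
flfm = np.cumsum(flpm)
```

With $\theta = 0$, $\phi_1 = 2$, and $\phi_2 = -1$ the recursion reduces to a plain double
summation, which gives a simple consistency check.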

For each of the noise types, the top of Figures 2 to 6 shows plots of 100 calculations of
$\hat{\sigma}_{total}(\tau)$, followed below by plots of 100 calculations of $\hat{\sigma}_y(\tau)$
from the same 100 simulations. At the bottom of each figure is a plot of the square root of the
mean of $\hat{\sigma}^2_y(\tau)$ derived from the 100 simulations in order to see its agreement or
disagreement with theory, that is, the theoretical square root of a mean of an infinite set.
Flicker of phase (FLPM) is the only type which does not have a straight-line (log-log scale)
theoretical slope, owing to a logarithmic dependence on bandwidth.


WHITE PM (WHPM) AND FLICKER PM (FLPM) CASES

For short $\tau$ values, we usually find noise modulation of the phase (not frequency) originating
from noisy electronics not involved in the frequency-determining elements. White PM (WHPM)
noise is broadband phase noise and has little to do with the resonance mechanism. Stages of
amplification are usually responsible for white PM noise. This noise can be kept very low with
good amplifier design, hand-selected components, the addition of narrowband filtering at the
output, or increasing, if feasible, the power of the primary frequency source.

Flicker PM (FLPM) noise may relate to a physical resonance mechanism in an oscillator, but it 
usually is added by noisy electronics. This type of noise is common, even in the highest quality 
oscillators, because in order to bring the signal amplitude up to a usable level, amplifiers are 
used after the signal source. Flicker PM noise may be introduced in these stages. It may also 
be introduced in a frequency multiplier or frequency synthesizer. 

Figures 2(a) and 3(a) show 100 plots of calculations of the square root of
$\hat{\sigma}^2_{total}(\tau)$ for 100 simulations of white PM noise and flicker PM noise,
respectively. Equation (2) is used for these calculations and N = 1024 for each simulation. Each of
the simulation averages of two-sample variances at $\tau = 1$ is equal to one. Figures 2(b) and 3(b)
are the traditional square root of maximally overlapped $\hat{\sigma}^2_y(\tau)$ for the same 100
simulations. The bottom plot is the square root of the mean of the sample Allan variances over all
100 simulations and shows excellent agreement with theory. The spread in the estimates is greater
using AVAR instead of the new statistic $\hat{\sigma}^2_{total}(\tau)$.

White PM and flicker PM both exhibit a $\tau^{-1}$ slope in $\sigma_y(\tau)$ and hence in
$\sigma_{total}(\tau)$. These noise types differ from the others in an important regard: their
amplitudes are significantly affected by measurement (software and/or hardware) bandwidth
[3, introduction]. Because of this, mod $\sigma^2_y(\tau)$, or modified Allan variance (MVAR), was
invented (for analyzing phase data only, $\{x_{k'}\}$) to take full advantage of the $1/(n\tau_0)$
slope in the standard variance of $\{x_{k'}\}$ for white PM and a slope involving $\ln \tau$ for
flicker PM. As mentioned earlier, we limit our present discussion to a comparison between
$\hat{\sigma}^2_y(\tau)$ and $\hat{\sigma}^2_{total}(\tau)$. This is because the present interest is
an improved confidence at long $\tau$ values, where more dispersive noise types are encountered and
ultimately limit accurate characterization of frequency stability. Again, a significant
disadvantage of MVAR is that the single longest reportable $\tau$ value is limited to 1/3 the total
measurement time; 50% more time is required for equivalent results using MVAR vs. AVAR. It suffices
to say, however, that for white PM and flicker PM, the improvement in confidence in the long term
is dramatic using the new statistic $\hat{\sigma}^2_{total}(\tau)$, as shown in Figures 2 and 3.
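
An experiment of this kind can be sketched as follows. The code is a simplified stand-in for the
study described here, under stated assumptions: the wrapped statistic is computed as the mean
squared circular second difference of the end-matched phase series, the white PM inputs are unit
Gaussian and are not renormalized to a unit two-sample variance at $\tau = 1$, and all function
names are hypothetical.

```python
import numpy as np

def overlapped_adev(x, m):
    """Square root of maximally overlapped AVAR from phase x (tau0 = 1)."""
    d2 = x[2 * m:] - 2.0 * x[m:-m] + x[:-2 * m]
    return np.sqrt(np.mean(d2 ** 2) / (2.0 * m ** 2))

def wrapped_totaldev(x, m):
    """Square root of a wrapped two-sample variance from phase x (tau0 = 1)."""
    # End-match the series, drop the duplicated last point, then wrap.
    z = (x - np.linspace(x[0], x[-1], x.size))[:-1]
    d2 = np.roll(z, -m) - 2.0 * z + np.roll(z, m)
    return np.sqrt(np.mean(d2 ** 2) / (2.0 * m ** 2))

def spread_experiment(n_points=1024, n_runs=100, seed=0):
    """Both deviations for n_runs independent white PM phase series."""
    rng = np.random.default_rng(seed)
    ms = 2 ** np.arange(0, 9)                 # m = 1, 2, 4, ..., 256
    adev = np.empty((n_runs, ms.size))
    totdev = np.empty((n_runs, ms.size))
    for r in range(n_runs):
        x = rng.standard_normal(n_points)     # white PM phase samples
        for i, m in enumerate(ms):
            adev[r, i] = overlapped_adev(x, m)
            totdev[r, i] = wrapped_totaldev(x, m)
    return ms, adev, totdev

ms, adev, totdev = spread_experiment()

# Relative spread across the runs at each averaging factor m.
rel_spread_adev = adev.std(axis=0) / adev.mean(axis=0)
rel_spread_totdev = totdev.std(axis=0) / totdev.mean(axis=0)
```

Comparing `rel_spread_adev` with `rel_spread_totdev` column by column gives a numerical
counterpart to the visual spread in panels (a) and (b) of the figures.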

WHITE FM (WHFM) CASE

The cases of white FM, flicker FM, and random walk FM are of particular importance since they
are physically traceable noise types encountered in virtually all precision frequency standards,
and they often occur at long $\tau$ values.

White FM noise ($\sigma_y(\tau) \propto \tau^{-1/2}$) is the type found in common passive-resonator
frequency standards. These contain a slave oscillator, often quartz, which is locked to a resonance
feature of another device which behaves as a high-Q filter. High quality cesium, rubidium,
and passive hydrogen standards have white FM noise characteristics [12]. Howe has previously
presented results using white FM simulation that show that the new statistic TOTALVAR is an
improved estimate of the mean-square frequency deviations between oscillators, particularly at
long $\tau$ values [5]. Figure 4 reproduces those results for the comparison here.

FLICKER FM (FLFM) CASE

Flicker FM ($\sigma_y(\tau) \propto \tau^0$) is a noise whose physical cause is not fully
understood but may typically be related to the physical resonance mechanism of an active
oscillator, the design or choice of parts used for the electronics, or environmental
conditions [12]. Flicker FM noise is considered the quantum limit of resonance devices [13].
Flicker FM is common in the highest quality oscillators but may be masked by white FM or even
white PM and flicker PM in lower quality oscillators.

Figure 5(a) shows 100 plots of calculations of $\hat{\sigma}_{total}(\tau)$ for 100 simulations of
flicker FM noise, and Figure 5(b) is the same set of calculations using the traditional square root
of maximally overlapped AVAR. The square root of the mean of the AVARs of the 100 simulations, as
shown in Figure 5(c), shows a slight downward offset, which can commonly occur at
$\tau = 512\tau_0 = T/2$. Even though the power law is not exact, it is sufficient for the
comparison of the spread in the responses between $\hat{\sigma}_{total}(\tau)$ and traditional
$\hat{\sigma}_y(\tau)$. Again, the new statistic is preferred since it is generally less
susceptible to large variations at long $\tau$ values.


RANDOM WALK FM (RWFM) CASE 

Of the five models of power-law noise types, random walk FM noise
($\sigma_y(\tau) \propto \tau^{1/2}$) is the most difficult to measure since its power is
concentrated mainly very close to the carrier. This translates to near DC when considering phase
differences $\{x_{k'}\}$ or average frequency differences $\{y_{k'}\}$. Random walk FM usually
relates to an oscillator's physical environment. If random walk FM is a predominant noise type,
then mechanical shock, vibration, humidity, temperature, or other environmental effects may be
causing "random" shifts in the carrier frequency [14, 15].

Figures 6(a) and 6(b) are 100 plots of calculations of the square roots of
$\hat{\sigma}^2_{total}(\tau)$ and $\hat{\sigma}^2_y(\tau)$, respectively, for 100 simulations of
random walk FM noise. Again, even though the simulated power law is assumed to be not exact, as
interpreted from the square root of the mean of the AVARs in Figure 6(c), the important point is
the comparison of the spread between the square roots of TOTALVAR and AVAR (Figures 6(a) and 6(b)).
And again, the square root of TOTALVAR is preferred since the spread and skews are reduced at long
$\tau$ values.


CONCLUSION 

We compare the response of the traditional sample Allan deviation $\hat{\sigma}_y(\tau)$ with a
new, similar sample statistic $\hat{\sigma}_{total}(\tau)$, referred to as TOTALDEV (square root of
TOTALVAR), for the five models of integer power-law noise types. These integer noise types are
white PM, flicker PM, white FM, flicker FM, and random walk FM. Using traditional plots of sigma
vs. tau and 100 simulations of each noise type, we find the variability in
$\hat{\sigma}_{total}(\tau)$ to be less than in $\hat{\sigma}_y(\tau)$ in all cases. As a result,
we can expect a reduction in the actual measurement time involved to characterize the long-term
frequency stability of a standard or oscillator.

Fig. 1

The new statistic (referred to as TOTALVAR and its square root) uses
the model that a time series of phase differences $x_{k'}$ is wrapped with
period T and overall frequency difference removed. The periodic
assumption means that the data are circularly represented and the time
origin is no longer fixed but is shiftable by $j\tau_0$, where $\tau_0$ is the minimum
measurement interval. TOTALVAR is traditional AVAR averaged over
$N-1$ possible shifts. Removal of the overall frequency difference
eliminates an end-match step by making $x_1 = x_N$, hence $x_1 = x_{N+1}$, and we
eliminate the increment $x_N$ to $x_{N+1}$ to avoid bias. We use

$$\hat{\sigma}^2_{total}(\tau) = \frac{1}{N-1} \sum_{j=1}^{N-1} \left[ \hat{\sigma}^2_y(\tau) \right]_j$$

where the argument in the brackets is traditional AVAR shifted by j.

Fig. 2

Top(a): Square root of TOTALVAR calculated for 100 WHPM
simulations with unit (two-sample) mean at $\tau = 1$.

Middle(b): For comparison, traditional square root of maximally-
overlapped AVAR calculated for the same 100 WHPM simulations as
used at top for square root of TOTALVAR.

Bottom(c): Square root of 100-total mean of maximally-overlapped
AVARs, an indication of the desired result. Dashed line is the
theoretical mean of an infinite set.

Fig. 3

Top(a): Square root of TOTALVAR calculated for 100 FLPM
simulations with unit (two-sample) mean at $\tau = 1$.

Middle(b): For comparison, traditional square root of maximally-
overlapped AVAR calculated for the same 100 FLPM simulations as
used at top for square root of TOTALVAR.

Bottom(c): Square root of 100-total mean of maximally-overlapped
AVARs, an indication of the desired result.

Fig. 4

Top(a): Square root of TOTALVAR calculated for 100 WHFM
simulations with unit (two-sample) mean at $\tau = 1$.

Middle(b): For comparison, traditional square root of maximally-
overlapped AVAR calculated for the same 100 WHFM simulations as
used at top for square root of TOTALVAR.

Bottom(c): Square root of 100-total mean of maximally-overlapped
AVARs, an indication of the desired result. Dashed line is the
theoretical mean of an infinite set.

Fig. 5

Top(a): Square root of TOTALVAR calculated for 100 FLFM
simulations with unit (two-sample) mean at $\tau = 1$.

Middle(b): For comparison, traditional square root of maximally-
overlapped AVAR calculated for the same 100 FLFM simulations as
used at top for square root of TOTALVAR.

Bottom(c): Square root of 100-total mean of maximally-overlapped
AVARs, an indication of the desired result. Dashed line is the
theoretical mean of an infinite set.

Fig. 6

Top(a): Square root of TOTALVAR calculated for 100 RWFM
simulations with unit (two-sample) mean at $\tau = 1$.

Middle(b): For comparison, traditional square root of maximally-
overlapped AVAR calculated for the same 100 RWFM simulations as
used at top for square root of TOTALVAR.

Bottom(c): Square root of 100-total mean of maximally-overlapped
AVARs, an indication of the desired result. Dashed line is the
theoretical mean of an infinite set.

Questions and Answers


JOE WHITE (NRL): Dave, what happens when you have any periodic effects in the data
that go into this sample? Real world data, for instance, since they have diurnals and things
like that, what does that do to the confidence of this type of thing where we're wrapping it
around on itself now, and these things no longer necessarily line up, particularly at the ends?

DAVE A. HOWE (NIST): Okay, well let me make a couple of comments about that, Joe. 
One is that the simulations assume that there is no periodicity in the data. The Allan Statistic 
is ideally suited for stochastic processes, but if there is a diurnal, then that’s a problem for the 
Allan Statistic. 

On that, I would expect the results to be similar; that is, if there's a periodicity in the data,
then once again, as you go to longer and longer averaging times, one would expect some nulls
to occur. But actually thinking about it, maybe not. Because since this
variance is a time-shift-invariant variance, then I think, though, it will just show the high value
throughout the run. So, that's a good question.

JOE WHITE (NRL): Let me follow up with one more that’s near and dear to my heart: 
Are you ready to talk about what the error bars ought to be on this kind of data when you 
do this sort of approach? You know, traditionally they run something like one over the square 
root of N as a rule of thumb. What would you say here? 

DAVE A. HOWE (NIST): I’m not in a position to talk about that. 

DR. GERNOT WINKLER (USNO, RETIRED): I think the old question is very closely 
related to the problem of how much systematics, how many systematics, do you first subtract 
before you go into the statistical analysis. I remember that we discussed it about 20 years ago, 
why this sudden drop in sigma tau. And Jim Barnes, in fact, at that time said that this is 
inevitable as soon as you subtract a systematic part. You remove, of course, the low frequency 
part; and therefore, the sigma tau has to drop at that point. 

Now when you have periodic content, again the description is that before you go into statistical
evaluation, you must remove systematics. But how much, where you put that dividing line,
whether you stop at the linear subtraction or a quadratic or a simple sinusoid, that is, of
course, the problem and the real question.

DAVE A. HOWE (NIST): Well, I understand. Typically, we use the model of just drift and 
linear rate. That’s as far as we go. We assume the rest of it is the residual noise. 

I do appreciate the question, I’m not sure I can shed any more light on that. There were no 
systematics in this data. There was no drift introduced. 